50% for the listing and 100% in the source form outputs.

Form feeds are also converted to multiple line feeds if T 
is specified. (This can be suppressed by a patch mentioned 
in Appendix D.) 

The D option specifies source type output. Line 
numbers, addresses, instruction codes, address comments, and 
literal dumps are deleted. 

The E option enables the EAE instruction disassembly. 

DISASM output is paged at each new origin. Under
certain circumstances, pages with very little information are
output and paper is wasted. However, pagination at each
origin permits removal of overlay and literal pages for 
comparison with the affected code and so aids interpretation 
of the code. 

3. Output Forms: 

================

It is assumed that the reader is familiar with PAL8 
source and listing formats. DISASM imitates these; in most
cases the DISASM source output cannot be distinguished from an
original PAL8 source except for the occurrence of certain
special mnemonics used to make the output more easily
understood. These special symbols are listed in Appendix A. The
listing form adds line numbers and address comments to the 
usual PAL8 listing format. For easy editing, the line numbers 
are reset at the start of each page. The line numbers of the 
listing correspond to the positions in the source and may be 
used in editing the source. Because of the deletion of literal 
dumps, the page numbers do not correspond. 

In the absence of defined symbols, the listing form
might appear: 


27    12451   4567            JMS I   2567
28    12452   1257            TAD     .+5
29    12453   5454            JMP I   .+1
30    12454   1356            TAD     2556


By the definition of the appropriate symbols, the same code 
might be listed: 


27    12451   4567            JMS I   (OUTPUT
28    12452   1257    RETURN, TAD     SAVAC
29    12453   5454            JMP I   .+1
30    12454   1356            TAD     2556    /EXIT

or (better):

27    12451   4567            JMS     OUTPUT
28    12452   1257    RETURN, TAD     SAVAC
29    12453   5454            JMP I   .+1
30    12454   1356            EXIT


OS-8 DISASM


DECUS NO. 8-639 


1. Introduction:

================

DISASM is a program to convert an absolute binary
file and a user symbol table into listing type or source
type output. In producing a listing, emphasis is placed
on making the listing as understandable as possible; in
producing a source type output, emphasis is placed on
duplicating as closely as possible the way an advanced
programmer would write the source.

DISASM will recreate direct off-page references,
local and zero page literals, symbolic address and data
tables, and suppressed origins. These features are
invoked by suitable symbol definitions. Displaced addresses
such as .+5 and SYMBOL-6 are a standard part of DISASM.

DISASM can easily be used to create a source file 
when the binary tape and a listing are available. It is 
also designed for the disassembly of undocumented binary 
programs. The symbol table can be fully defined by inspection 
of a listing or gradually evolved from inspection of successive 
disassembly listings. 

Included with DISASM is a program, SPLIT, for
splitting large binary files into small segments for individual 
disassembly. 

2. Loading and Calling DISASM: 

==============================

DISASM is loaded and called in the usual manner 
with no special qualifications. A typical sequence of loading 
instructions would appear: 

.R ABSLDR
*PTR:I

.SAVE SYS: DISASM
.R DISASM

*OUTPUT.PA<SYMTBL.PA,BIN01,BIN02,BIND1/T/D/E

The default output is LPT:, but the T option must be
used with most printers because, without it, the output contains
tabulation codes. (The author has no line printer but uses
the handler FORMAT under the device name LPT:. Tabs should be
set at the normal eight column intervals.)

The first input must be a symbol table or null. The
symbol table extension must be specified. The default
extension is BN. Up to eight binary input files may be
specified.

The T option causes tabulation codes in the output 
to be converted to spaces. This increases file length about 

All symbol definitions for DISASM are made according
to the field in which they appear. Labels appearing in the
label column and in the address (operand) column
are only those defined for the current memory field. Addresses 
are output according to the following priority: 

1. exact symbol labels. 

2. current address with a displacement up to 7 

(.-7 to .+7). 

3. symbols with displacement up to 7. 

4. absolute octal addresses. 
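
The priority list above can be sketched in Python. This is only an illustration, not DISASM's own code: the function name format_address and the dictionary of symbols are hypothetical, the choice of the symbol with the smallest displacement in step 3 is an assumption borrowed from the rule stated for address tables, and printing a bare "." for a zero displacement is likewise assumed.

```python
def format_address(target, current, symbols):
    """Render an operand address using the four-step priority list.

    target  -- address referenced by the instruction
    current -- address of the instruction itself
    symbols -- hypothetical dict of label -> value for the current field
    """
    # 1. exact symbol labels
    for name, value in symbols.items():
        if value == target:
            return name
    # 2. current address with a displacement up to 7 (.-7 to .+7)
    delta = target - current
    if -7 <= delta <= 7:
        # A bare "." for zero displacement is an assumption of this sketch.
        return "." if delta == 0 else f".{delta:+o}"
    # 3. symbols with displacement up to 7 (smallest displacement assumed)
    best = None
    for name, value in symbols.items():
        d = target - value
        if abs(d) <= 7 and (best is None or abs(d) < abs(best[1])):
            best = (name, d)
    if best is not None:
        return f"{best[0]}{best[1]:+o}"
    # 4. absolute octal addresses
    return f"{target:o}"


print(format_address(0o2457, 0o2452, {}))                    # current-address form
print(format_address(0o2457, 0o2452, {"SAVAC": 0o2457}))     # exact label
print(format_address(0o2562, 0o2454, {"OUTPUT": 0o2570}))    # label with displacement
```

Displacements are printed in octal, which for values up to 7 is the same digit as in decimal.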

In the listing form only, if a label has a value equal
to the instruction, it is added as a comment. If no label
in the current field has the correct value, but a symbol
defined in some other field does, it will be printed.

Disassembly of instructions can be suppressed for one
or more locations by the use of special symbols. The
instruction can then be interpreted as an address or data. If address
is specified, the symbol with smallest displacement (up to 7)
in any field is entered in the operator column. Priority goes
to the current field. If no symbol qualifies, an octal value
is output. In the last example, the special symbol /@S=2454
caused "EDIT" to be printed; EDIT was defined as 1356.

4. DISASM Symbol Tables: 

========================

DISASM contains all the permanently defined symbols of
PAL8 with the exception of IOT and OPR. It also contains a
set of EAE symbols and the basic floating point symbols. The
former are enabled by the E option and the latter by the
definition of FLPNT, the zero page location of the floating
point routine entry address. FLPNT must be the first symbol
defined. The location and structure of the permanent tables
are specified in Appendix C.

All of field 1 except the top page is devoted to user
symbols. Six words are used for each symbol, so DISASM has
capacity for 661 symbols. Programs using more symbols are
disassembled in segments using only part of the total symbol
table at any given time. The author usually defines symbols
in several files. Separate files are used to define those
symbols which are clearly local to a segment of code. The 
files required for several segments are merged using PIP 
before those sections of code are disassembled. 
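
The figure of 661 symbols follows from the size of a memory field. A short Python check, assuming the standard PDP-8 sizes of 4096 words per field and 128 words per page (the constant and function names are only illustrative):

```python
FIELD_WORDS = 0o10000     # 4096 words in one PDP-8 memory field
PAGE_WORDS = 0o200        # 128 words in one memory page
WORDS_PER_SYMBOL = 6      # per the text: six words for each symbol


def symbol_capacity(reserved_pages=1):
    """Number of whole symbols that fit in field 1 minus the reserved top page(s)."""
    usable = FIELD_WORDS - reserved_pages * PAGE_WORDS
    return usable // WORDS_PER_SYMBOL


print(symbol_capacity())  # (4096 - 128) // 6
```

The six words agree with the packing described in Appendix B: five words of name (ten characters at two per word) and one word of value.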

The user must define symbols for all labels, literals,
and data or symbol items and tables. Special symbols are also
available for introducing suppressed origins and altering
instructions. All these user symbols must be input in the
first input file specification, in one file.

The symbol file must start with a title. The first
character in the file must be a /. Up to 33 characters
are taken from the first line for the title. The rest of the 
line is discarded. 

If the floating point instructions are to be recognized, 
the second line must contain the definition FLPNT=n where 
n is the floating point indirect entry address. This definition 
serves only to define FINT=JMS I n. FLPNT should be redefined 
in each field in which it is used. 

Labels must be defined by the field in which they occur. 
All labels for the same field must be defined contiguously. 

The field declaration has the form /FIELD=n where n is an
octal digit. The declaration sets pointers to the first symbol 
in each field. Repetition of the same field declaration is a 
fatal error. 

The labels or user symbols are usually strings of
printing characters (including spaces) followed by an "="
and an octal value or address of not more than four digits.
Note that the equal sign cannot be included in a symbol.
The symbol will be stored, but will not print back.
Normally, each symbol is defined on a different line. See
Appendix B for the technical description of symbol definition.
DISASM will accept up to ten characters for a symbol; it
ignores any additional characters. It will ignore, not store,
any symbols with zero value. The comment slash is a legal
character in symbols, but if it immediately terminates a
value, it is recognized as a comment initiator and the rest of
the line is ignored. If it is desired to temporarily delete a
symbol, this can be done by inserting "0/" following the "=".
With the exception of unusual user symbols and literal symbols,
the file is acceptable to PAL8. The symbols not acceptable
to PAL8 can be placed at the end of the file, separated
from the others by "$=0". PAL8 can then be used with the
N option to alphabetically list the symbols.

5. Special Symbols: 

===================

It has been noted that FLPNT has special meaning 
when it is the first symbol and that /FIELD has special meaning. 
DISASM recognizes a number of other special symbols. 

All symbols starting with the character [ are
considered to be zero page literals. Under the D option,
source type output, locations with labels starting with [
and origins preceding them are deleted.

All symbols starting with < are recognized as local,
non-zero page, literals which may have been created by direct
off-page references. The character < never appears in the
output. If these symbols are referenced directly, < is changed
to (, but if referenced indirectly, the I and the < are
deleted. Symbols starting with ( are also recognized as local
literals. These should be used where the value of the literal
corresponds to a local or zero page address. As in the case
of zero page literals, they are deleted from source type
listings, but origins preceding local literals are replaced by
the pseudo-operator PAGE.

The symbol definitions /@SS=n and /@S=n suppress
instruction disassembly and cause DISASM to output symbols
instead. The definition /@S=n causes the instruction at
address n to be interpreted as an address or symbol and then
restores disassembly at the next location. The symbol /@SS
switches DISASM into symbol table mode. /@S can be used to
switch DISASM for one location only or to terminate a table.

The definitions /@NS=n and /@N=n switch DISASM to
data mode, in which the octal values are output in the operation
column. /@NS is used to start a data table. /@N is used to
insert a single value or to terminate a table. The /@S
symbols take precedence and can be used to insert labels into
data tables.

At times it is desired to alter the value of an
instruction. I/O errors can be corrected and code altered.
The definition /@I=m=n will cause disassembly of the code m
at location n. In the listing form output, the instruction
code in the third column is not changed, but the output in
subsequent columns corresponds to the code m.

The definition /@O=m=n causes output of a suppressed
origin of value m at address n. The current address is also
reset. The new origin causes output of a new page. See the
section on disassembling overlays.

The special /@ symbols are executed as part of the
symbol table look-up routine and the search continues
following their identification. Location of other symbols
with the desired value terminates the search. Thus if a label
is defined with the same value as a special symbol, the special
symbol must precede the label. Also, if two labels are defined
with the same value in the same field, the second will never be
identified; it merely consumes table space and adds to
the search time.

6. Using DISASM with a Listing: 

DISASM was originally written to create source files 
for programs where source tapes acceptable to the system were 
not available. (FOCAL source was available only on DECtape.) 

The symbols are copied directly from the listing.
The symbol table is of some value, but it does not indicate
the field in which the label belongs. The meaning of the
literals is identified from the text and the dumps are
used as a check.

It is possible to copy a symbol file from the listing 
and immediately output a final source file. The probability 
of typing errors and the relative ease with which they can be 
detected from a listing type output make it desirable to
first output the disassembly as a listing. The listing line 
numbers can also be used in editing the source output. 


7. Using DISASM with a Symbol Table: 

====================================

When a symbol table is available, but no listing, and
it is desired to create a listing or source, it is usually
easiest to first obtain a disassembly with no symbol table or,
better, with one that contains only a title and a dummy field
setting. This listing can be used to identify the fields in
which the symbols are defined and to define the literal symbols.
Some data and symbol tables can be identified from the first
pass. The existence of zero page literals is indicated by a
dump at the end of the code for a given field. The local or
non-zero page literals are identified by an origin near the
top of the page, code in every location to the top of the page,
and normally another origin following the literal dump.
The identification of the dump can only be confirmed by checking
the sequence of references to the dump, but this is rarely
needed. In the first disassembly, the dump might appear:


                        *575
4    00575  1426        TAD I   26
5    00576  0453        AND I   53
6    00577  0215        AND     415


The next page would start with an origin. The following 
definitions would then be added to the symbol table: 


/@SS=575
/@S=577
<1426=575
(453=576
<215=577

Suppressing disassembly and outputting symbols conceals the
rare occasions when literals consist of instructions
such as <JMS OUTPUT, but contributes more to identifying their
meaning. On the second listing, the dump might appear:


3                       *575
4    00575  1426        <1426,  ERROR+2
5    00576  0453        (453,   OUTPUT
6    00577  0215        <215,   215


<1426 can now be redefined as <ERROR+2 and (453 as (OUTPUT.
Note that the character < was used except in the case of (453,
since 453 is a current page address. Literals corresponding to
current page or zero page addresses should be defined using (.
If the data field is not current, TAD OUTPUT, obtained using
<OUTPUT, is not executed in the same manner as TAD I (OUTPUT
obtained by defining (OUTPUT.

7. Disassembling Undocumented Tapes:

In disassembling all but the shortest undocumented tapes,
it is very valuable to use SPLIT. Frequent disassembly of short 
segments of the program with the symbol table freshly revised 
greatly assists interpretation of the code. 

Following the first disassembly, using a symbol table 
which contains only a title and a dummy field setting, all 
literal symbols should be defined as above. DISASM cannot
tell the user what the code does, so there is no alternative
to tracing out the operation of the instructions. A quick 
scan of the first dump may reveal familiar I/O routines, USR 
calls, and similar routines which can quickly be labeled. In 
inventing new labels, it is important to avoid duplication. 

It is usually simplest to build the symbol table in segments. 
Segregate literals at the end. Editing and correcting the table 
will be easiest if the symbols are entered in order of increasing 
value within each segment. Use PAL8 with the N option to 
frequently list the symbols in alphabetical order. 

8. Overlays and Patches: 

Overlays and patches can be extremely difficult to
interpret in their original location. If it is found that
a section of code from 543 to 602 is to be moved to 200 to
237, it is very difficult to recognize the significance
of an instruction such as JMP I 430 as meaning jump to the
address specified in 573. Interpretation can be greatly
simplified by inserting a suppressed origin using the definition
/@O=200=543. The address will then appear at 230 and the
instruction will be disassembled as JMP I 230. Labels
for the overlay can then be defined using the translated
values, in the interval 200-237.
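
The address arithmetic behind this example can be checked with a short Python sketch. It is an illustration only: the function names are hypothetical, the instruction code 5630 (JMP I, current page, offset 30) and its location 560 inside the overlay are assumed for the sake of the example, and the operand decoding follows the standard PDP-8 memory reference format.

```python
def relocate(address, new_origin, old_origin):
    """Map an address in the loaded overlay to the address where it will run."""
    return address - old_origin + new_origin


def mri_operand(instruction, location):
    """Operand address of a PDP-8 memory reference instruction at `location`."""
    offset = instruction & 0o177
    if instruction & 0o200:              # current-page bit set
        return (location & 0o7600) | offset
    return offset                        # page zero reference


OLD, NEW = 0o543, 0o200                  # from the definition /@O=200=543

# The word loaded at 573 runs at 230; the block 543-602 runs at 200-237.
print(f"{relocate(0o573, NEW, OLD):o}")
print(f"{relocate(0o602, NEW, OLD):o}")

# A hypothetical JMP I instruction (5630) loaded at 560:
print(f"{mri_operand(0o5630, 0o560):o}")                       # as loaded
print(f"{mri_operand(0o5630, relocate(0o560, NEW, OLD)):o}")   # as executed
```

Because the operand of a current-page instruction depends on the page of the instruction itself, the same code word reads as JMP I 430 where it is loaded and as JMP I 230 once the suppressed origin is applied.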

There are no special precautions required in inserting
suppressed origins in DISASM listings. References to labels
outside the overlay, even to literals, will be correct. However,
PAL8 imposes severe restrictions on the use of literals in
code displaced by suppressed origins and in code on the
same page but preceding the suppressed origin. After
interpretation of the overlay, it may be necessary to delete the
suppressed origin and define logical expressions to translate
the address references. These can later be edited into the
source output.

9. Floating Point Disassemblies:

================================

As noted previously, if the first symbol is FLPNT=n,
then JMS I n will be regarded as a command to enter floating
point disassembly mode. If n=7, as is common, then on every
occurrence of 4407, DISASM will enter this mode. If the entry
is data, then all code until the next zero will be incorrectly
disassembled. Once this has been spotted, it can be suppressed
by using a /@S or /@N symbol to suppress the disassembly of
the 4407 data item.

Certain programs, such as FOCAL, use non-standard floating
point operation codes. With these programs, redefinition of
the codes requires modification of the floating point table
at 2054. If FEXT is not zero, a major patch will be required
since exit requires calling a special routine. See Appendix C
for the structure of the tables.

Special commands with the zero operation code such as 
square root have not been defined. They are not uniform in 
usage. 

10. DISASM Error Routines and Codes: 

====================================

If the output file becomes full, DISASM prints "FULL"
and calls the command decoder. The user should input only an
output file; input files will be ignored. The first output
file ends in an incomplete line. If the files are merged using EDIT, no
information is lost. If the first file is moved in anything
except image mode, there is a risk that the partial line will be
lost.

All other detected errors are fatal and output files 
are lost. The codes for the errors are: 

Output handler error: 0 
Input handler error: 1
Output open error: 2 
Output close error: 3 
Default LPT: not available: 4 
Symbol table error: 5 

DISASM attempts to recover from most symbol table
errors. The only errors causing error exits are:

Failure to start with a / (no title).
Initial field not specified.
Field specification repeated for the same field.
Symbol table overflow.

11. SPLIT: 


SPLIT is a simple, not very elegant, file splitting
program. Loaded and called in the usual manner, it expects
a four character output name and a single input file. If only
an output device is specified, it will supply the name SPLT.

SPLIT scans the input file and outputs it to a
sequence of files sequentially numbered, e.g. SPLT01.BN,
SPLT02.BN, ..., SPLT99.BN. Whatever the output name the user
specifies, SPLIT supplies the fifth and sixth characters and
the BN extension.

Each time that SPLIT encounters an origin that is on
a different memory page than the last, it closes the last file
and inserts the new origin in a new file. If SPLIT has not
seen any field setting, it does not insert any field setting.
On the other hand, if it has seen a field setting, it starts
every file with a field setting.
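
The splitting rule can be sketched in Python as follows. This is an illustration under stated assumptions, not SPLIT's actual logic: the input is reduced to a hypothetical list of origin addresses, "the last" is taken to mean the previous origin seen, field settings are ignored, and the two-digit file number is assumed to be decimal.

```python
PAGE_SHIFT = 7   # a PDP-8 memory page is 128 (octal 200) words


def split_origins(origins, base="SPLT"):
    """Group origins into output files, starting a new file at each new page.

    Returns a list of (file name, [origins]) pairs. The name is the
    four-character base plus a two-digit sequence number and the BN extension.
    """
    files = []
    last_page = None
    for origin in origins:
        page = origin >> PAGE_SHIFT
        if page != last_page:
            name = f"{base[:4]}{len(files) + 1:02d}.BN"
            files.append((name, [origin]))
        else:
            files[-1][1].append(origin)
        last_page = page
    return files


for name, group in split_origins([0o200, 0o250, 0o400, 0o600, 0o610]):
    print(name, [f"{o:o}" for o in group])
```

Two origins on the same page, such as 200 and 250, stay in one file; an origin on any other page opens the next numbered file.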


Since SPLIT will have closed a number of output
files before it can detect a check sum error, it finishes
all output and then reports the error. Other errors are
immediately fatal. They include the same codes 0-3 as DISASM,
but 4 represents an improper end of file.

APPENDIX A
Symbols not in PAL8

In addition to the symbols defined in PAL8, DISASM
uses the following symbols which may have to be defined
before the DISASM output can be reassembled.


Micro instructions, always available:

    STM2=7344    STP3=7325    ST2K=7332
    STM3=7346    STP4=7307    ST4K=7330
    STP2=7305    STP6=7327    ST6K=7333

EAE instructions, available with the E option:

    MUY=7405     ASR=7415     SCL=7403
    DVI=7407     LSR=7417     SCA=7441
    NMI=7411     MQL=7421     MQA=7501
    SHL=7413

Floating Point Instructions, when FLPNT is the first symbol:
Regular: FINT=JMS I FLPNT    FEXT=0

MRI instructions:

    FADD=1000    FMPY=3000    FGET=5000
    FSUB=2000    FDIV=4000    FPUT=6000
    FNOR=7000

APPENDIX B
User Symbol Input

Considerable power has been introduced into DISASM
as it evolved by manipulation of symbols. Many such
opportunities have already been formalized, but the
user who understands the symbol input algorithms can still
invent new tricks. The format of the input is much more
flexible than indicated. The PAL8 symbol table can be used
as input if a field setting is inserted and all spaces in the
table are converted to tabs. The input routine first tests
for a /. If this is found, the first line is accepted as a
title (else fatal error). The input routine then requests
the first definition and tests for FLPNT. If present, it is
processed, and another definition requested. In either case, the
routine then enters a loop to accept a definition and test for a
/FIELD symbol; if recognized, set the pointers and go back
for another definition, else test the value for zero. If zero,
go back for another, else store it.

The definition fetching routine calls a symbol fetching
routine. If the symbol is a /@ special symbol, a special routine
is called to assemble the values for the special symbol.
(Special symbols are flagged by the first word being zero.
As they are identified, a check is also made for symbols starting
@@ and they are converted to @A to avoid conflict.) The
definition fetching routine then calls a subroutine to fetch the
value. The special symbols are stored as a zero, then
a negative symbol number, and then a special value if needed.

The symbol fetching routine recognizes 275 (=) and all
codes below 240 as terminators. When first called, the routine
skips terminators. The first non-terminator starts the character
scan. The first ten characters are packed as 6-bit chopped ASCII.
Additional characters are scanned and discarded until a
terminator is found. If a terminator is found before the tenth
character, the storage is filled out with zeroes (@). In printing
back symbols, DISASM ignores zeroes and so does not print back
the character @. The normally available terminators are
horizontal tab, line feed, vertical tab, form feed, carriage return,
and equals.

The value accepting routine skips symbol terminators.
The first non-symbol-terminator initiates value acceptance.
Only octal digits are accepted. Up to four of these are
converted to a binary value. Any other character terminates the
value. In case there are four octal digits, the next character
is fetched for use as a terminator. The terminator is tested
for /, and if found, the text to the next carriage return is
dumped.


The definition "SYM=1" followed by a carriage return
will pack 2431,1600,0,0,0 for the symbol and 0001 for the value.
The definition "SYM=AB" will cause the input value to be started
and terminated by the "A". The value will be zero and SYM
will not be stored. The input routine will then try to fetch a
value for the symbol "B". Thus:

SYM=AB
NEXT=5

will store only "EXT" with a value of 5. The "A" and the "N" are
lost as value terminators. This is the result of the routine
to recognize the comment slash.
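
The symbol and value fetching rules described in this appendix can be modelled with a small Python sketch that reproduces the behaviour above. It is a simplified illustration, not the DISASM routine: the function names are hypothetical, characters are treated as 7-bit ASCII (so the terminators are "=" and all codes below octal 40), and the title line, FLPNT, /FIELD and the special /@ symbols are not handled.

```python
OCTAL_DIGITS = "01234567"


def is_terminator(ch):
    """Symbol terminators: '=' and every control code (below octal 40)."""
    return ch == "=" or ord(ch) < 0o40


def fetch_symbol(chars):
    """Skip terminators, then collect characters up to the next terminator."""
    for ch in chars:
        if not is_terminator(ch):
            break
    else:
        return None                      # input exhausted
    name = [ch]
    for ch in chars:
        if is_terminator(ch):
            break
        name.append(ch)
    return "".join(name)[:10]            # only ten characters are kept


def fetch_value(chars):
    """Skip terminators, then accept up to four octal digits."""
    for ch in chars:
        if not is_terminator(ch):
            break
    else:
        return 0
    value = digits = 0
    while ch is not None and ch in OCTAL_DIGITS and digits < 4:
        value = value * 8 + int(ch)
        digits += 1
        ch = next(chars, None)
    # ch is now the value terminator and has been consumed.
    if ch == "/":                        # comment: dump text to the next CR
        for ch in chars:
            if ch == "\r":
                break
    return value


def read_definitions(text):
    """Return the stored symbols; zero-valued symbols are ignored."""
    chars = iter(text)
    table = {}
    while True:
        name = fetch_symbol(chars)
        if name is None:
            return table
        value = fetch_value(chars)
        if value != 0:
            table[name] = value


print(read_definitions("SYM=AB\r\nNEXT=5\r\n"))
```

In this model the "A" starts and ends the first value, "B" becomes a symbol whose value is started and ended by the "N", and only EXT is stored, with the value 5.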


APPENDIX C 

DISASM Permanent Tables 

Two different types of tables are used in DISASM.
The first contains instruction codes, each followed by the
two word (4 chopped character) mnemonic. Padding is done with
spaces in the permanent tables. The first type contains entries in
an order such that combined micro-instructions precede their
components. They are searched under a mask and matching
bits are deleted under the mask. These tables are terminated by
zero entries. The second type is accessed by displacement into
the table and contains only the mnemonic.

I/O Tables: These are accessed indirectly using a displacement
address table. The device number (bits 3-8) is used as a
displacement into a table at 2400. Zero entries indicate that
the device is not defined. Non-zero entries are addresses of
tables of the first type containing the micro instructions for
the device. The device tables extend from 3263 to 3466. 3467
to 3577 is available for expansion. If more space is required,
the user might consider disabling the EAE or floating point
commands to obtain more space. Provided that the table at 2400
is properly maintained, rearrangement of the I/O table area is
simple.
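
The displacement lookup can be illustrated in Python. This sketch only computes which slot of the table at 2400 an I/O instruction selects; the function names are hypothetical, and the bit numbering follows the PDP-8 convention in which bit 0 is the most significant bit of the 12-bit word.

```python
IO_TABLE_BASE = 0o2400    # displacement address table named in the text


def device_number(instruction):
    """Device number: bits 3-8 of a 12-bit IOT instruction."""
    return (instruction >> 3) & 0o77


def device_table_slot(instruction):
    """Address of the table slot holding the pointer to the device table."""
    if (instruction >> 9) != 0o6:
        raise ValueError("not an IOT instruction (operation code 6)")
    return IO_TABLE_BASE + device_number(instruction)


print(f"{device_table_slot(0o6046):o}")   # device 04
```

A zero word in the selected slot would mean the device is undefined; a non-zero word would be the address of that device's micro instruction table.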


Micro instructions: The tables for Group 2 (3033), Group 1
(3160) and the EAE (3116) instructions are of the same format
as the I/O tables. The EAE instruction table is easily
altered. The Group 2 table has a special first entry which must
be preserved.

Operands: The regular operands (AND, TAD, ISZ, DCA, JMS, and
JMP) are in a displacement table at 3102. The floating point
operands are in a displacement table at 2054. Since this table
does not contain a zero entry, its base is 2052.

APPENDIX D
Possible Patches

A few hints are provided on patching DISASM for special
applications.

Output control: Output is designed for the OS-8 handler FORMAT.
This handler is used under the device name LPT: on the system
under which DISASM was developed. It may be desired to complement
the T option or to change the relation of the expansion of
tabs and form feeds. As loaded, DISASM is set for both tab
and form feed conversion. The T option patches the code to
suppress conversion. The option can be complemented by changing
location 3651 from SZA CLA to SNA CLA (7650). If form feed
terminals are standard, changing 3652 to 5255 will cause the
program to always output 214 codes, T option or not. Changing
3654 to 7200 would cause tabs to be always converted.

Special Micro instructions: The STM2 and similar combined
micro-instructions are included to improve interpretation. They
can be deleted by replacing their operation codes by 4000. (Zero
would terminate the table.) The new DEC symbols cannot be
implemented as they are too long for the operation output routine.

Alteration of the floating point MRI instructions
requires only changing the mnemonics in the table. Changing
FEXT would require reassembly (get the listing and use
DISASM to disassemble DISASM).

Suppressing literal recognition does not require a patch. 
Prefix the literal with a. 


Patch to correct internal buffer overflow problem: 


LOC 00127    Change to 7601 (was vacant)
LOC 00226    Change to 1127 (was 1173)